Gentles et al. Systematic Reviews 2013, 2:95
http://www.systematicreviewsjournal.com/content/2/1/95






RESEARCH Open Access 



Factors explaining the heterogeneity of effects of 
patient decision aids on knowledge of outcome 
probabilities: a systematic review sub-analysis 

Stephen J Gentles 1,4*, Dawn Stacey 2, Carol Bennett 3, Mohamad Alshurafa 1 and Stephen D Walter 1



Abstract 

Background: There is considerable unexplained heterogeneity in previous meta-analyses of randomized controlled 
trials (RCTs) evaluating the effects of patient decision aids on the accuracy of knowledge of outcome probabilities. 
The purpose of this review was to explore possible effect modification by three covariates: the type of control 
intervention, decision aid quality and patients' baseline knowledge of probabilities. 

Methods: A sub-analysis of studies previously identified in the 2011 Cochrane review on decision aids for people
facing treatment and screening decisions was conducted. Additional unpublished data were requested from 
relevant study authors to maximize the number of eligible studies. RCTs (to 2009) comparing decision aids with 
standardized probability information to control interventions (lacking such information) and assessing the accuracy 
of patient knowledge of outcome probabilities were included. The proportions of patients with accurate knowledge 
of outcome probabilities in each group were converted into relative effect measures. Intervention quality was 
assessed using the International Patient Decision Aid Standards instrument (IPDASi) probabilities domain. 

Results: A main effects analysis of 17 eligible studies confirmed that decision aids significantly improve the 
accuracy of patient knowledge of outcome probabilities (relative risk = 1.80 [1.51, 2.16]), with considerable 
heterogeneity (87%). The type of control did not modify effects. Meta-regression suggested that the IPDASi 
probabilities domain score (reflecting decision aid quality) is a potential effect modifier (P = 0.037), accounting for a 
quarter of the variability (R² = 0.28). Meta-regression indicated the control event rate (reflecting baseline knowledge)
is a significant effect modifier (P = 0.001), with over half the variability in ln(OR) explained by the linear relationship
with log-odds for the control group (R² = 0.52); this relationship was slightly strengthened after correcting for the
statistical dependence of the effect measure on the control event rate. 

Conclusions: Patients' baseline level of knowledge of outcome probabilities is an important variable that explains 
the heterogeneity of effects of decision aids on improving accuracy of this knowledge. Greater relative effects are 
observed when the baseline proportion of patients with accurate knowledge is lower. This may indicate that 
decision aids are more effective in populations with lower knowledge. 

Keywords: Decision aid, Clinical heterogeneity, Meta-analysis, Meta-regression, Subgroup analysis, Effect modification, 
Baseline rate, Control event rate 
Background

In systematic reviews of binary outcomes, heterogeneity 
conventionally refers to the variation in relative effects 
(relative risk, odds ratio) across studies that is greater than 
one would expect by chance [1]. The causes of such 
study-level variation can either be artifactual, where meth- 
odological differences between studies affect the relative 
effect measures, or real, where differences may be attribut- 
able to variation across studies in factors related to the 
population included, active interventions used or compar- 
ators employed [2,3]. When present, unexplained hetero- 
geneity complicates the interpretation and usefulness of 
pooled effect estimates of meta-analyses in decision- 
making. It is for this reason that the quality of pooled evi- 
dence is typically downgraded when assessed using the 
GRADE framework [4]. Attempts to explain sources of 
heterogeneity are important for overcoming these limi- 
tations and for their potential to contribute knowledge 
about what types of patients benefit most from a specific 
intervention [2,3,5,6]. The Cochrane meta-analysis of ran- 
domized controlled trials (RCTs) evaluating patient deci- 
sion aid effects on the accuracy of knowledge of outcome 
probabilities is an example where interpretation of the 
pooled effect has been hampered by high heterogeneity. 

Patient decision aids are complex interventions used to 
help patients make specific and deliberative choices 
among treatment or screening options by providing, at the 
minimum, information on the options and associated outcomes
relevant to the patient's health status, and implicit
methods to clarify their values or preferences [7]. Due to
their complex nature - involving multiple interacting 
components and behaviors - and the diverse clinical set- 
tings they are designed for, the exact form of the interven- 
tion and populations in which they are evaluated vary 
considerably. There is thus a corresponding expectation of 
variation in real decision aid effects across conditions. 

The effects of decision aids on numerous decision- 
related outcomes have been extensively evaluated. Since 
patients are known to underestimate probabilities of 
harms or overestimate probabilities of benefits [8], deci- 
sion aids are often designed to communicate estimates 
of probabilities derived from population-based research. 
Such probabilities apply to possible outcomes of the fea- 
tured decisions: benefits and harms of an intervention, 
or true- and false-positive or -negative screening results 
[9]. Studies that evaluate the effects of decision aids on 
the accuracy of patient knowledge of these outcome 
probabilities generally measure the proportion of pa- 
tients who are able to correctly answer questions about 
population-derived probability estimations - making this 
a binary outcome. 

The most recent (2011) update to the Cochrane sys- 
tematic review on patient decision aids includes 86 RCTs 
where the authors reviewed 23 different outcomes [7]. 



Accuracy of knowledge of outcome probabilities (labeled
'accurate risk perception' in that review) was the second-most
frequently measured outcome, and the results of 14
studies were pooled. Meta-analysis revealed a uniform dir- 
ection of effect favoring decision aids across all studies, and 
the pooled effect estimate was significant (relative risk = 
1.74 [1.46 to 2.08], P < 0.001). The level of heterogeneity,
however, was significant (P < 0.001) and considerable (I² = 83%).
Despite this, the pooled effect is considered informative
to a degree since decision aids showed a uniformly
positive effect. However, the Cochrane review mentions 
that 'the pooled effect size and CI should be interpreted as 
a range across conditions, which may not be applicable to 
a specific condition' [10], reflecting the limitation to the in- 
terpretability and utility of the pooled random effects esti- 
mates found in meta-analyses when there exists substantial 
real variation in intervention effects. In other words, the 
pooled estimate does not correspond to any individual de- 
cision aid, setting or population. Furthermore, it is impos- 
sible to predict where any given decision aid would lie 
within the wide range of possible relative effects [3]. 

The 2011 Cochrane update [7] tentatively explored two 
sources of heterogeneity affecting this outcome. First, it 
showed that the effect size of decision aids in which prob- 
abilities were represented numerically is larger than for 
those where probabilities were described with words, 
suggesting possible effect modification attributable to this 
specific aspect of the intervention. Secondly, removing 
three (of 14) studies with the lowest control event rate (se- 
lected as outliers by visual inspection) reduced heterogen- 
eity to non-significant levels (P = 0.3), implicating control 
levels of accurate knowledge as a potential contributor to 
heterogeneity [10]. While informative, these preliminary 
analyses were not selected with any overall rationale and 
did not provide formal tests for effect modification. 

The current investigation aims to improve the interpretability
and usefulness of the available research evidence regarding
decision aid effects on the accuracy of patient
knowledge of outcome probabilities by exploring and 
characterizing potential contributors to the observed het- 
erogeneity [2-4]. Subgroup analysis and meta-regression 
were employed to investigate the potential effects of three 
study-level factors (covariates): the type of control inter- 
vention, the level of decision aid quality and the control 
event rate. These covariates were chosen because they 
represent the best available measures that summarize or 
combine relevant characteristics of the comparator (con- 
trol), active intervention or study population, respectively. 

Methods 

As a sub-analysis of the previous Cochrane systematic 
review on decision aids [7], certain aspects of the original 
methods were not repeated in detail here - principally 
the literature search and parts of the literature 
selection. In addition, the original review can be consulted
for further information on individual studies,
including setting, patients included, intervention charac- 
teristics and risk of bias assessments. 

Data sources and study selection 

Studies previously identified through electronic database 
searches (MEDLINE, PsycINFO, CINAHL, EMBASE, 
Cochrane Central Controlled Trials Register) in the 2011 
Cochrane review served as the basis for study selection 
[7]. Thus, RCTs published up to December 2009 meeting 
the original selection criteria were considered. As an add- 
itional criterion, we included studies where data had been 
collected on the proportions of participants in both inter- 
vention and control groups who had accurate knowledge 
of outcome probabilities post-intervention. To maximize 
the number of studies available for analysis, the 86 pub- 
lications included in the 2011 Cochrane update were 
rescreened to identify studies where the relevant outcome 
data might exist but had not been previously published. 
The corresponding authors were then emailed up to three 
times requesting unpublished data used to calculate rela- 
tive risks and copies of the original decision aids. 

Data extraction 

Data from all studies were extracted in duplicate (SG, 
MA) using piloted forms. In addition to newly eligible 
studies, data were re-extracted from the set of 14 studies 
pooled in the 2011 Cochrane update [7]. In cases of dis- 
agreement with the outcome data from the previous 
Cochrane review, its authors (CB, DS) were consulted 
and consensus was reached on which data to use for the 
current review. 

Event rates, defined as the proportion of patients in each
group (decision aid or control) correctly answering questions
about probabilities, were extracted. Relative risk was
calculated as the event rate in the decision aid group divided
by that in the control group. In eight studies that
evaluated knowledge of outcome probabilities with more 
than one question, the proportion of correct answers was 
averaged. For purposes of GRADE assessment, the risk of 
bias items applicable at the outcome level (blinding, in- 
complete outcome data, specifically for assessments of 
knowledge of probabilities) were abstracted, as these items 
were previously reported in the Cochrane update [7] only 
at the study level. Information for the three covariates ana- 
lyzed was abstracted (described below). 
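
To illustrate how extracted proportions translate into a relative risk, the following minimal Python sketch uses the event counts listed for Lerman 1997 in Figure 1. The function names are hypothetical and are not part of the original analysis, which used Review Manager; averaging across several knowledge questions is shown only as a simple arithmetic mean, which is an assumption about how the averaging was done.

```python
def event_rate(events: int, total: int) -> float:
    """Proportion of patients with accurate knowledge in one group."""
    if total <= 0:
        raise ValueError("total must be positive")
    return events / total


def relative_risk(events_da: int, total_da: int,
                  events_ctrl: int, total_ctrl: int) -> float:
    """Event rate in the decision aid group divided by the control event rate."""
    return event_rate(events_da, total_da) / event_rate(events_ctrl, total_ctrl)


def mean_proportion_correct(proportions: list[float]) -> float:
    """Average the proportion correct across several knowledge questions
    (assumed here to be a simple unweighted mean)."""
    if not proportions:
        raise ValueError("at least one proportion is required")
    return sum(proportions) / len(proportions)


# Counts for Lerman 1997 as listed in Figure 1:
# 90/122 in the decision aid group, 108/164 in the control group.
cer = event_rate(108, 164)
rr = relative_risk(90, 122, 108, 164)
print(f"CER = {cer:.2f}, RR = {rr:.2f}")  # CER = 0.66, RR = 1.12
```

The printed values agree with the CER and RR reported for this study in Table 1.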

Selection of study-level factors (covariates) investigated 

Study-level factors with the potential to contribute to 
heterogeneity (covariates) were considered to represent 
three principal sources of clinical heterogeneity: charac- 
teristics of the comparator (control), the active interven- 
tion and the population [2]. To minimize the risk of 
detecting spurious effect modification due to multiple 



comparisons, only one covariate was selected to represent 
each main source, to give a total of three [11,12]. In each 
case, the covariates were selected for their availability and 
biologic plausibility (likelihood based on a mechanistic ra- 
tionale) as substantial contributors to heterogeneity [11]. 
For the first category, comparator (or control), only one 
covariate was available and therefore selected: the type of 
control intervention. Since multiple covariates were avail- 
able corresponding to characteristics of active intervention 
and study population, a top-down approach was used in 
which the best available measure that combined poten- 
tially relevant characteristics was selected in each case. To 
represent intervention characteristics, a composite meas- 
ure of relevant decision aid quality characteristics was 
chosen. For population characteristics, the control event 
rate was chosen because it provides a convenient sum- 
mary measure [13]. The rationale, hypothesis and meas- 
urement for each covariate are described below. 

Type of control intervention 

Depending on the context, not all studies evaluating de- 
cision aids provide the same degree of standardized in- 
formation to the control group [7]. Three types of 
control intervention, from less to more standardized in- 
formation, are categorized: (1) no standardized infor- 
mation other than usual care; (2) generic standardized 
information used as a sham, such as basic background 
on the disease, and containing no outcome information 
or (3) information on outcomes associated with options, 
sometimes considered as a less intense form of decision 
aid. In all cases, control interventions differ from the ex- 
perimental intervention by providing no information on 
outcome probabilities. Higher levels of standardized in- 
formation may have a hidden effect on patients' ability 
to answer questions about probabilities. The hypothesis 
for this covariate was that control interventions that 
provide more standardized information to the control 
group, because they may conceivably improve control 
patients' ability to answer questions about probabilities, 
would decrease relative effect size. Possible effect modi- 
fication by this categorical covariate was investigated 
with subgroup analysis. 

Decision aid quality 

The International Patient Decision Aid Standards 
(IPDAS) collaboration has developed an instrument, the 
IPDASi, for rating the quality of decision aids [14]. 
IPDASi includes a probabilities dimension consisting of 
eight items corresponding to theoretical elements de- 
rived from systematic review of the evidence on effective 
formats for communicating outcome probabilities to 
patients [15]. The items address factors including the 
presentation of event rates, specification of a time 
period, the allowing for comparison of probabilities 
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across options, the reporting of levels of uncertainty 
around probabilities, the provision of multiple ways of 
viewing probabilities (for example, words, numbers and 
diagrams) and providing balanced information to limit 
framing biases [14]. The probabilities dimension therefore 
represents a comprehensive composite measure of rele- 
vant decision aid characteristics likely to affect knowledge 
of probabilities. Moreover, its continuous scale probably 
gives greater statistical power when testing for effect 
modification than does an equivalent categorical variable. 
The hypothesis for this covariate was that decision aid 
scores on the IPDASi probabilities dimension would in- 
crease as the effectiveness of decision aids for improving 
knowledge of outcome probabilities increases - which, if 
true, would support the predictive validity of the probabil- 
ities dimension of IPDASi [14]. Decision aids were scored 
in duplicate by trained raters on a scale from 1 to 4 points 
for each of 8 items in this dimension (scores provided by 
NJW). The possible ratings of 8 to 32 were re-scaled to a 
range of 0% to 100%. The effects of this continuous covari- 
ate were investigated with meta-regression. 
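
The re-scaling described above can be sketched as a linear min-max transformation. This form is an assumption inferred from the stated ranges (8 to 32 mapped to 0% to 100%); it reproduces the reported mapping of a raw score of 16 to 33.3%. The function name and parameters are hypothetical.

```python
def rescale_ipdasi(raw_score: float, n_items: int = 8,
                   item_min: int = 1, item_max: int = 4) -> float:
    """Map a raw IPDASi probabilities-dimension score onto a 0-100% scale."""
    lowest = n_items * item_min    # 8 when every item is rated 1
    highest = n_items * item_max   # 32 when every item is rated 4
    if not lowest <= raw_score <= highest:
        raise ValueError(f"raw score must lie between {lowest} and {highest}")
    return 100.0 * (raw_score - lowest) / (highest - lowest)


for raw in (16, 19, 32):
    print(f"{raw}/32 -> {rescale_ipdasi(raw):.1f}%")
# 16/32 -> 33.3%
# 19/32 -> 45.8%
# 32/32 -> 100.0%
```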

Control event rate 

The control event rate (CER) in this context is the pro- 
portion of patients in the control group who correctly 
answer specific questions about probabilities. Note that 'control
event rate' is used in preference to 'baseline risk' to
minimize confusion, since 'risk' in this case corresponds 
to a favored outcome (that is, having accurate knowledge 
of probabilities). Assuming the type of control interven- 
tion does not modify its effects (and our investigations 
found no evidence that it does), the control event rate 
provides an estimate of the baseline level of accurate 
knowledge of outcome probabilities in the population 
studied. Patients' baseline knowledge of these probabil- 
ities may vary widely depending on factors such as 
whether specific probabilities are likely to be common 
knowledge, newness of the underlying evidence or pa- 
tient education levels. The plausibility of effect modifica- 
tion was first suggested in the 2009 Cochrane update 
where heterogeneity was reduced to non-significant 
levels after removing three studies with the lowest con- 
trol event rate [10]. The hypothesis for this covariate 
was that studies with higher control event rates have 
lower relative risks. Effects due to this continuous covar- 
iate were investigated with meta-regression. 

Analysis 

Three types of statistical analysis were performed: meta- 
analysis of main effects, subgroup analysis to test for ef- 
fect modification by the one categorical covariate (type 
of control intervention) and meta-regression to test for 
and characterize effect modification by the two continu- 
ous covariates (decision aid quality and control event 



rate). Each analysis type is described in further detail. 
The threshold for statistical significance was P < 0.05. 

Meta-analysis of main effects 

Consistent with previous meta-analysis of the main ef- 
fects for this outcome [7], relative risk was used as the 
effect measure. The software Review Manager (RevMan, 
version 5.1, Copenhagen, The Nordic Cochrane Centre, 
The Cochrane Collaboration, 2011) was used to combine 
estimates using the DerSimonian and Laird random- 
effects model. Tau-squared in this model provides an es- 
timate of the between-study variance. A chi-squared test 
was used to examine the strength of evidence about 
whether heterogeneity is present, and I² provides an estimate
of its magnitude.
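
As a rough illustration of the quantities named above, the sketch below implements a generic inverse-variance DerSimonian and Laird estimator in plain Python. It is not the Review Manager implementation: the forest plot in Figure 1 is labeled 'M-H, Random', so this sketch should not be expected to reproduce the reported pooled estimate exactly. The function names are hypothetical, and the inputs are assumed to be study effects on a log scale with their within-study variances.

```python
import math


def i_squared(q: float, df: int) -> float:
    """I-squared (%) from Cochran's Q and its degrees of freedom."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


def dersimonian_laird(effects, variances):
    """Pool study effects with a DerSimonian-Laird random-effects model.

    Returns (pooled effect, standard error, tau-squared, Q, I-squared %).
    """
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two studies with matching variances")
    w = [1.0 / v for v in variances]          # fixed-effect weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)             # between-study variance
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return pooled, se, tau2, q, i_squared(q, df)


# Consistency check using the heterogeneity statistics shown in Figure 1
# (Chi-squared = 120.19, df = 16).
print(f"I-squared = {i_squared(120.19, 16):.0f}%")  # I-squared = 87%
```

The final line only checks that the reported chi-squared statistic and degrees of freedom are consistent with the reported I² of 87%.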

Subgroup analysis (type of control intervention) 

Potential effect modification by the three types of control 
intervention was tested with a weighted one-way ANOVA. 
To provide additional support for a lack of effect on the 
control event rate, a weighted ANOVA between type of 
control intervention and control event rate was per- 
formed. ANOVAs were calculated using the software IBM 
SPSS Statistics (version 20.0 for Windows, Armonk, NY, 
IBM Corp.), using the natural logarithm of the odds ratio, 
ln(OR), as the effect measure for consistency with subse- 
quent covariate analyses. 

Meta-regression analysis (decision aid quality and control 
event rate) 

Univariate weighted least squares (WLS) meta-regression 
analyses were conducted to test for and characterize po- 
tential effect modification by IPDASi probabilities dimen- 
sion score and control event rate, separately. 
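
A univariate WLS fit of this kind can be sketched in a few lines of plain Python. This is a generic closed-form weighted regression, not the SPSS procedure used in the study; the function name is hypothetical and the data in the usage example are invented purely for illustration.

```python
def wls_fit(x, y, weights):
    """Weighted least squares fit of y = intercept + slope * x.

    Returns (intercept, slope, weighted R-squared).
    """
    if not (len(x) == len(y) == len(weights)) or len(x) < 3:
        raise ValueError("need at least three points with matching weights")
    sum_w = sum(weights)
    x_bar = sum(w * xi for w, xi in zip(weights, x)) / sum_w
    y_bar = sum(w * yi for w, yi in zip(weights, y)) / sum_w
    sxx = sum(w * (xi - x_bar) ** 2 for w, xi in zip(weights, x))
    sxy = sum(w * (xi - x_bar) * (yi - y_bar)
              for w, xi, yi in zip(weights, x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    ss_res = sum(w * (yi - intercept - slope * xi) ** 2
                 for w, xi, yi in zip(weights, x, y))
    ss_tot = sum(w * (yi - y_bar) ** 2 for w, yi in zip(weights, y))
    return intercept, slope, 1.0 - ss_res / ss_tot


# Hypothetical data: covariate values, ln(OR) values and study weights.
covariate = [-2.0, -1.0, 0.0, 1.0]
ln_or = [2.1, 1.3, 0.9, 0.4]
study_weights = [3.0, 5.0, 6.0, 4.0]
intercept, slope, r2 = wls_fit(covariate, ln_or, study_weights)
print(f"intercept = {intercept:.3f}, slope = {slope:.3f}, R2 = {r2:.2f}")
```

In a meta-regression, each weight would typically be the reciprocal of a study's within-study variance plus the between-study variance, so that more precise studies contribute more to the fitted line.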

In selecting the most appropriate scales for these analyses,
the effect measure was first considered. Changing
the effect measure (between relative risk (RR), OR or ln(OR))
and scale for representing the relationship has
been recommended as a strategy to minimize apparent
heterogeneity and effect modification as a first step in
reducing the chance of detecting a spurious interaction
in meta-regression where control event rate is a covariate
[6,16,17]. Of the three effect measures, ln(OR) had
the lowest heterogeneity (I², Table 1) and was found in
exploratory analyses to have the least significant slope vs
control event rate, providing justification for using this
effect measure in the meta-regression. As additional
justification, the natural log of the OR is commonly chosen
because it has better statistical properties since zero is
the value of no effect [6]. For the analysis of decision aid
quality, ln(OR) was plotted against the re-scaled IPDASi
probabilities dimension score (0% to 100%). For the analysis
of control event rate, ln(OR) was plotted against
log-transformed values of the control event rate (that is,
logit control) so that both variables could share the same
scale, making a linear model easier to interpret. With ln(OR)
as the common effect measure, exploratory multiple
regression combining the CER and IPDASi probabilities
dimension score could be performed more easily.

Table 1 Study-level covariate values, observed effect size measures and pooled heterogeneity estimates listed in order
of increasing relative risk

| Study | Type of control (a) | IPDASi probabilities dimension score (/32) | Rescaled IPDASi probabilities dimension score (%) | CER | Logit control | ln(OR) | OR (b) | RR (b) |
|---|---|---|---|---|---|---|---|---|
| Lerman et al. [28] | A | 16 | 33 | 0.66 | 0.65 | 0.37 | 1.46 | 1.12 |
| Johnson et al. [32] | A | 16 | 33 | 0.77 | 1.17 | 0.96 | 2.86 | 1.17 |
| Wolf and Schorling [24] | B | 19 | 46 | 0.54 | 0.16 | 0.73 | 2.08 | 1.31 |
| Whelan et al. [25] | A | 17 | 38 | 0.58 | 0.32 | 0.91 | 2.52 | 1.34 |
| McBride et al. [27] | A | 23 | 63 | 0.30 | -0.85 | 0.49 | 1.64 | 1.37 |
| Schapira and Vanruiswyk [23] | B | 28 | 83 | 0.47 | -0.10 | 0.89 | 2.45 | 1.45 |
| Dodin et al. [31] | B | 26 (c) | 75 | 0.43 | -0.28 | 0.82 | 2.32 | 1.48 |
| O'Connor et al. [8] | C | 26 (c) | 75 | 0.46 | -0.14 | 1.05 | 2.91 | 1.54 |
| Whelan et al. [21] | B | 22 | 58 | 0.37 | -0.53 | 0.82 | 2.29 | 1.55 |
| Kuppermann et al. [36] | C | NA | NA | 0.32 | -0.76 | 1.35 | 3.88 | 2.03 |
| Vandemheen et al. [37] | B | 30 | 92 | 0.29 | -0.88 | 1.52 | 4.67 | 2.26 |
| McAlister et al. [29] | A | 29 (d) | 88 | 0.16 | -1.62 | 1.11 | 3.06 | 2.29 |
| Mathieu et al. [34] | B | 31 | 96 | 0.22 | -1.29 | 1.54 | 4.71 | 2.62 |
| Man-Son-Hing et al. [30] | A | 29 (d) | 88 | 0.24 | -1.16 | 1.83 | 6.32 | 2.80 |
| Weymiller et al. [35] | B | 32 | 100 | 0.18 | -1.48 | 1.88 | 6.94 | 3.38 |
| Laupacis et al. [33] | A | 24 | 67 | 0.08 | -2.34 | 1.50 | 4.88 | 3.72 |
| Gattellari and Ward [22] | B | 21 | 54 | 0.10 | -2.14 | 2.29 | 10.26 | 5.28 |
| Chi-squared (heterogeneity) | | | | | | 55.75 | 56.41 | 120.19 |
| I² | | | | | | 71% | 72% | 87% |

(a) A, no standardized information; B, standardized generic information (no outcome information); C, simple decision aid (no standardized probability information).
(b) No continuity correction was applied to match Review Manager's output.
(c) Same decision aid for both trials.
(d) Same decision aid for both trials.

CER, control event rate; IPDASi, International Patient Decision Aid Standards instrument; NA, full decision aid not available for rating; OR, odds ratio; RR, relative risk.

Since the selected effect measure ln(OR) is not avail- 
able in RevMan, Excel was used to generate an equiva- 
lent meta-analysis for ln(OR) to obtain the tau-based 
weights for the meta-regressions. Excel formulae were 
verified by comparing (non-continuity-corrected) back- 
translated values to the RevMan output for OR. Event 
frequencies for this meta-analysis were continuity- 
corrected (adding 0.5). IBM SPSS Statistics was then 
used to calculate standard WLS regressions using the 
tau-based weights. Neither regression model (logit con- 
trol vs ln(OR) or re-scaled IPDAS vs ln(OR)) was found 
to violate the assumptions of linear regression (linearity, 
independence, homoscedasticity and normality) upon 
examination of the residual plots (predicted vs residual,
independent vs residual and normal probability (Q-Q) plots
of the residuals).
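
The calculations described in this paragraph can be sketched as follows, using the Lerman 1997 counts from Figure 1. The variance formula is the usual large-sample approximation for a log odds ratio, and the weight is the usual random-effects form 1/(variance + tau-squared); both are assumptions about what 'tau-based weights' means here. Applying the same 0.5 correction to the control-group logit is also an assumption, although it appears consistent with the rounded values in Table 1. Function names are hypothetical.

```python
import math


def log_odds_ratio_cc(events_da, total_da, events_ctrl, total_ctrl, cc=0.5):
    """Continuity-corrected ln(OR) and its approximate variance."""
    a = events_da + cc                   # decision aid, correct
    b = total_da - events_da + cc        # decision aid, incorrect
    c = events_ctrl + cc                 # control, correct
    d = total_ctrl - events_ctrl + cc    # control, incorrect
    ln_or = math.log((a * d) / (b * c))
    variance = 1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d
    return ln_or, variance


def logit_control_cc(events_ctrl, total_ctrl, cc=0.5):
    """Continuity-corrected log-odds of a correct answer in the control group."""
    return math.log((events_ctrl + cc) / (total_ctrl - events_ctrl + cc))


def random_effects_weight(variance, tau2):
    """Study weight combining within-study and between-study variance."""
    return 1.0 / (variance + tau2)


# Lerman 1997 (Figure 1): 90/122 decision aid, 108/164 control.
ln_or, var = log_odds_ratio_cc(90, 122, 108, 164)
logit_c = logit_control_cc(108, 164)
print(f"ln(OR) = {ln_or:.2f}, logit control = {logit_c:.2f}")
# ln(OR) = 0.37, logit control = 0.65
```

Both printed values match the entries for this study in Table 1.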

The meta-regression against control event rate incor- 
porated a bias correction. When baseline response rates 
(control event rates) are used as the covariate in a meta- 
regression, the measurement error in control event rate 
and the functional dependence of the observed treat- 
ment effect on the control group response can bias the 
standard WLS regression and lead to incorrect inference 
about the degree to which the control event rate modi- 
fies effects and underlies heterogeneity [13,16,18]. This 
problem was addressed using a modified WLS approach 
developed previously [18], which considers sampling 
error in the control event rate and generates bias terms 
that are used to correct the standard regression coeffi- 
cients. Bias terms and bias-corrected regression coeffi- 
cients were calculated using Excel, the formulae for 
which were verified using data from the original article 
describing this approach [18]. 

To calculate relative risk values predicted by the bias-corrected
regression formula for corresponding control
event rate values, back-translation was performed using
Excel.
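
One way to perform such a back-translation is sketched below. It assumes the regression has the form ln(OR) = intercept + slope × logit(CER), as laid out in Table 2, and uses the rounded bias-corrected coefficients from that table (intercept 0.88, slope -0.466), so results are approximate. The function name is hypothetical and the original Excel formulae are not reproduced here.

```python
import math


def predicted_relative_risk(cer, intercept, slope):
    """Relative risk implied by a logit-control vs ln(OR) regression line."""
    if not 0.0 < cer < 1.0:
        raise ValueError("control event rate must lie strictly between 0 and 1")
    logit_control = math.log(cer / (1.0 - cer))
    ln_or = intercept + slope * logit_control
    # Odds in the decision aid group = OR x odds in the control group.
    odds_da = math.exp(ln_or + logit_control)
    event_rate_da = odds_da / (1.0 + odds_da)
    return event_rate_da / cer


# Rounded bias-corrected coefficients from Table 2.
INTERCEPT, SLOPE = 0.88, -0.466
for cer in (0.1, 0.3, 0.5, 0.7):
    rr = predicted_relative_risk(cer, INTERCEPT, SLOPE)
    print(f"CER = {cer:.1f} -> predicted RR = {rr:.2f}")
```

With these coefficients the predicted relative risk falls as the control event rate rises, which is the pattern the meta-regression describes.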

GRADE assessment 

The GRADE framework was employed to provide a 
standardized summary rating of the pooled evidence for 
the outcome of interest based on key quality dimensions: 
risk of bias, consistency, directness, precision and publi- 
cation bias [4,19,20]. The software GRADEpro (version 
3.2 for Windows, 2008) was used. 

Results 

Meta-analysis of main effects 

Of 86 studies from the 2011 Cochrane review, 17 studies 
were included in the current meta-analysis of the effects 
of decision aids on the accuracy of knowledge of out- 
come probabilities [8,21-37]. Efforts to obtain additional 
unpublished data resulted in three studies [32,34,35] be- 
ing added to the 14 from the 2011 Cochrane analysis for 
this outcome. The authors of three additional studies 
who were contacted either confirmed that relevant data 
were unavailable (n = 1) or were unable to provide data
(n = 2). Figure 1 shows that the main pooled relative effect
for the outcome, accuracy of patients' knowledge of outcome
probabilities, was significant, with a uniform direction
of effect favoring decision aids (relative risk = 1.80
[1.51, 2.16]); heterogeneity was significant (P < 0.001)
and considerable (I² = 87%).

Subgroup and meta-regression analysis of covariate 
effects 

Table 1 shows the covariate values and corresponding ef- 
fect sizes for each study. For the subgroup analysis that 
tested effect modification due to the type of control 



intervention used (no standardized information, generic 
information or simple decision aid without probability in- 
formation), the weighted ANOVA was not significant (F = 
2.33, degrees of freedom, df = 2, P = 0.11). As further sup- 
port for the lack of effect of the type of control interven- 
tion on the control event rate, the second ANOVA 
between these two covariates also lacked significance (F = 
0.49, df = 2, P = 0.62).

Table 2 summarizes the relationships corresponding to 
each of the two meta-regression analyses: decision aid 
quality (rescaled IPDASi probabilities dimension score) 
vs effect size, and log-transformed control event rate vs 
effect size before and after bias correction. 

The quality (IPDASi probabilities dimension scores) of 
the decision aids evaluated in the included studies 
ranged widely from 16 to 32 out of a total possible score 
of 32 (33.3% to 100% when rescaled). The slope of the 
univariate regression relationship between the rescaled 
quality scores (%) and ln(OR) (Figure 2) was significant 
(intercept 0.253, slope 0.013, P = 0.037), and accounted 
for a quarter of the variability in effect size between 
studies (R² = 0.28).

The control event rate (representing the proportion of 
control patients with accurate knowledge of outcome prob- 
abilities) ranged widely among the 17 studies, from 0.08 to 
0.77. The slope of the univariate regression between logit 
control and ln(OR) in Figure 3, prior to bias correction 
(dotted line), was significant (slope = -0.436; P = 0.001); 
this relationship was slightly steeper (that is, strengthened) 
after bias correction (solid line, slope = -0.466). In the non- 
bias-corrected analysis, the control event rate accounted 
for just over half of the variability in effect size (R² = 0.52).

The multiple regression, which combined IPDASi
probabilities dimension score and control event rate,
was significant (P = 0.007) and accounted for slightly
more variability (R² = 0.54). While effect modification
due to control event rate was still significant in this
model (P = 0.018), IPDASi probabilities dimension score
lost significance (P = 0.561).

| Study or subgroup | Decision aid: events | Decision aid: total | Control: events | Control: total | Weight | Risk ratio, M-H, random [95% CI] |
|---|---|---|---|---|---|---|
| Lerman 1997 | 90 | 122 | 108 | 164 | 7.1% | 1.12 [0.96, 1.31] |
| Johnson 2006 | 29 | 32 | 27 | 35 | 6.8% | 1.17 [0.95, 1.45] |
| Wolf 2000 | 189 | 266 | 72 | 133 | 7.0% | 1.31 [1.10, 1.56] |
| Whelan 2004 | 73 | 94 | 62 | 107 | 6.9% | 1.34 [1.10, 1.63] |
| McBride 2002 | 109 | 265 | 82 | 274 | 6.7% | 1.37 [1.09, 1.73] |
| Schapira 2000 | 84 | 122 | 64 | 135 | 6.8% | 1.45 [1.17, 1.80] |
| Dodin 2001 | 33 | 52 | 21 | 49 | 5.6% | 1.48 [1.01, 2.17] |
| O'Connor 1998 | 58 | 81 | 39 | 84 | 6.4% | 1.54 [1.18, 2.02] |
| Whelan 2003 | 47 | 82 | 34 | 92 | 6.0% | 1.55 [1.12, 2.15] |
| Kuppermann 2009 | 157 | 244 | 80 | 252 | 6.8% | 2.03 [1.65, 2.48] |
| Vandemheen 2009 | 46 | 70 | 23 | 79 | 5.6% | 2.26 [1.54, 3.31] |
| McAlister 2005 | 70 | 187 | 27 | 165 | 5.5% | 2.29 [1.55, 3.38] |
| Mathieu 2007 | 198 | 351 | 77 | 357 | 6.8% | 2.62 [2.10, 3.25] |
| Man-Son-Hing 1999 | 92 | 139 | 35 | 148 | 6.1% | 2.80 [2.05, 3.83] |
| Weymiller 2007 | 30 | 50 | 8 | 45 | 3.7% | 3.38 [1.73, 6.58] |
| Laupacis 2006 | 14 | 47 | 4 | 50 | 2.1% | 3.72 [1.32, 10.51] |
| Gattellari 2003 | 57 | 106 | 11 | 108 | 4.1% | 5.28 [2.93, 9.50] |
| Total (95% CI) | | 2310 | | 2277 | 100.0% | 1.80 [1.51, 2.16] |
| Total events | 1376 | | 774 | | | |

Heterogeneity: Tau² = 0.11; Chi² = 120.19, df = 16 (P < 0.00001); I² = 87%
Test for overall effect: Z = 6.47 (P < 0.00001)
Forest plot axis: risk ratios below 1 favour control; risk ratios above 1 favour decision aid.

Figure 1 Main effects of decision aids on patient knowledge of outcome probabilities. CI, confidence interval; df, degrees of freedom; RR,
relative risk.

Table 2 Regression coefficients for normalized IPDASi
probabilities dimension score vs ln(OR) and logit control
vs ln(OR)

| Relationship | Intercept | Slope (standard error) |
|---|---|---|
| (a) Normalized IPDASi probabilities score vs ln(OR) | 0.25 | 0.013 (0.006) |
| (b) Logit control vs ln(OR), non-bias-corrected | 0.86 | -0.436 (0.108) |
| Logit control vs ln(OR), bias-corrected | 0.88 | -0.466 |

IPDASi, International Patient Decision Aid Standards instrument; OR, odds ratio.

GRADE assessment of evidence quality 

The quality of the evidence supporting the use of deci- 
sion aids for improving the accuracy of patient know- 
ledge of outcome probabilities was assessed here as 
'moderate' with the GRADE framework (Table 3). Disregarding
any explanation of sources of heterogeneity
provided in the current study, the same body of pooled
evidence would be assessed as 'low' (GRADE table not
shown) due to rating down for 'inconsistency'.

Discussion 

Our analysis of main effects of decision aids on the accur- 
acy of patient knowledge of outcome probabilities includes 
unpublished data from three studies in addition to the 14 
studies previously included in the 2011 Cochrane analysis 
for this outcome. Compared to this earlier analysis, the 
added data slightly increase the pooled relative risk (from 
1.74 to 1.80) and maintain the finding that all studies
uniformly favor decision aids; additionally, they slightly
increase the level of heterogeneity (from I² of 83% to 87%)
[7]. As recognized in the previous Cochrane review [10], 
this substantial level of heterogeneity limits the interpret- 
ability of the random effects pooled estimate since it rep- 
resents an average of possible real effects of decision aids 
that vary widely from setting to setting. Thus an investiga- 
tion of the factors that may influence this variation is 
warranted to better understand the conditions under 
which decision aids have their greatest effects [2,3,5,6]. 
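
For readers who want to see where the heterogeneity percentage comes from, the following minimal sketch computes I² from Cochran's Q and its degrees of freedom using the usual definition, I² = 100 × (Q − df)/Q, truncated at zero. The function name `i_squared` is illustrative; the inputs are the Chi² and df values reported with Figure 1.

```python
def i_squared(q, df):
    """Percentage of variability attributed to heterogeneity rather than chance.

    Uses the standard definition 100 * (Q - df) / Q, with negative
    values truncated to zero.
    """
    if q <= 0:
        return 0.0
    return max(0.0, 100.0 * (q - df) / q)

# Values reported for the main effects analysis: Chi² = 120.19, df = 16.
print(f"I² = {i_squared(120.19, 16):.0f}%")
```

With these inputs the result rounds to the 87% reported for the 17-study analysis.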

Given that factors underlying real variation of intervention
effects can include study-level characteristics of
the comparator or control intervention, the active intervention
or the study population [2,3], the current investigation
analyzed the effects of three covariates
chosen to represent each of these sources of variability.
There was no evidence that the type of control interven- 
tion modifies either the effect size or the control event 
rate. This negative finding provides incidental support for 
an assumption integral to the third covariate analysis of ef- 
fect modification by control event rate (see Methods). 
That is, any effect modification is unlikely to be con- 
founded by the control intervention manipulating effect 
size via effects on the control event rate. Thus the control 
event rate can be more reliably interpreted as representing 
a study population's baseline level of knowledge of outcome
probabilities.

The second covariate, decision aid quality as repre- 
sented by the IPDASi probabilities dimension score, was 
found to modify effect size, the positive relationship ob- 
served being consistent with the expectation that higher- 
quality decision aids produce larger effect sizes. Overall, 
this result provides tentative support for the predictive 
validity of the probabilities dimension of the IPDASi, 




[Figure 2 plot. x-axis: IPDAS Probabilities Dimension Score (%), 30 to 100]

Figure 2 Meta-regression of the effect of decision aid quality: normalized IPDASi probabilities dimension score vs ln(OR). Kuppermann
et al. [36] is excluded since this decision aid was not available for scoring on the IPDASi probabilities dimension. The area of each circle is
proportional to the weight for that study. IPDASi, International Patient Decision Aid Standards instrument; OR, odds ratio.

[Figure 3 plot. x-axis: Logit control]

Figure 3 Meta-regression of the effect of control event rate: logit control vs ln(OR). The dashed line is prior to bias correction. The solid
line is after bias correction. The area of each circle is proportional to the weight for that study. OR, odds ratio.



although statistical significance is borderline (P = 0.037). 
Significance is lost, for example, when IPDASi probabil- 
ities dimension score is combined in multiple regression 
with control event rate - although for any bivariate regres- 
sion to be sufficiently powered, a larger sample size would 
generally be advisable. Thus, additional studies are neces- 
sary to improve certainty regarding effect modification 
due to the IPDASi probabilities dimension score. 
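
To give a sense of scale for the slope in Table 2, row (a), the sketch below converts it into odds ratio terms. It assumes the normalized score is expressed in percentage points (0 to 100, as on the Figure 2 axis) and that the intercept refers to a score of zero; both are reading assumptions rather than details confirmed in the text, and the function names are illustrative.

```python
import math

INTERCEPT = 0.25  # Table 2, row (a): ln(OR) at a score of 0 (assumed uncentered)
SLOPE = 0.013     # Table 2, row (a): change in ln(OR) per one-point score increase

def predicted_or(score_pct):
    """Odds ratio predicted by the fitted line at a given score (%)."""
    return math.exp(INTERCEPT + SLOPE * score_pct)

def or_multiplier(delta_points):
    """Factor by which the predicted OR changes for a given score difference."""
    return math.exp(SLOPE * delta_points)

print(f"OR multiplier per 10 points: {or_multiplier(10):.2f}")
for score in (50, 75, 100):
    print(f"score {score}%: predicted OR {predicted_or(score):.2f}")
```

On these assumptions, a 10-point difference in score corresponds to roughly a 14% higher predicted odds ratio, which illustrates why a slope of this size can be hard to distinguish from zero with a small number of studies.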

Nevertheless, there are reasons to expect that decision 
aid quality defined according to the IPDASi probabilities 
dimension does in reality modify the effectiveness of decision
aids. Firstly, the individual components of decision aid design
on which the IPDASi probabilities dimension is based [14]
are supported by a review of evidence providing
biologic or theoretical plausibility [15]. Secondly, subgroup 
analysis in the 2011 Cochrane review provides direct evi- 
dence for at least one design feature - that using numbers 
rather than words in decision aids to communicate probabilities
improves knowledge of those probabilities to a statistically
significantly greater extent [7]. The components
of decision aid design that may underlie variation in effect
sizes are not restricted to the IPDASi probabilities domain, 
however, and the updated IPDAS review summarizing re- 
cent evidence for presenting probabilities [9] describes 
additional promising factors to explore in future analyses 
of effect modification. The effects of individual design fac- 
tors were not examined here because of the top-down ap- 
proach to selecting covariates and the decision to restrict 
their number to minimize the risk of detecting spurious ef- 
fect modification due to multiple comparisons. The selec- 
tion of specific factors for future analyses should consider 
both the theoretical plausibility of effect modification, and 
whether the selected design feature is likely to be consist- 
ently relevant for all decision aids since some features, 
such as those relevant only to screening decisions, restrict 
the sample size of studies available for analysis. 

For the third covariate, control event rate, the nega- 
tively sloped relationship is highly significant (P = 0.001) 
and is slightly steeper after correcting for dependence 
of the effect measure on the control event rate, increasing 



Table 3 GRADE [20] evidence quality assessment for the effect of decision aids on the accuracy of patient knowledge
of outcome probabilities

Risk of bias        Serious (a)
Inconsistency       No serious inconsistency (b)
Indirectness        No serious indirectness
Imprecision         No serious imprecision (c)
Publication bias    Unlikely (d)
Quality             ⊕⊕⊕○ MODERATE


a. Study-level risk of bias assessments reported in the 2011 Cochrane update were used, except for two newly extracted risk of bias items where outcome-level
assessments were more appropriate (blinding and incomplete outcome data). No studies had a high risk of bias due to sequence generation (11 were unclear and
6 low), or allocation concealment (6 were unclear and 11 low). Only two studies representing a small weighting in the pooled analysis (Lerman et al. [28] and
McAlister et al. [29]) had a high risk of bias due to incomplete outcome data (1 was unclear and 14 low). Ten studies could be considered to have a risk of bias 
due to inadequate blinding of outcome assessment (4 were evidently unclear and 3 were evidently low), but for these studies the accuracy of knowledge of 
outcome probabilities was generally assessed using objective a priori criteria, thus inadequate blinding of outcome assessment was not considered serious. Most 
studies of decision aids do not blind the personnel delivering the intervention, and risk of bias was rated down for this reason. 

b. Inconsistency was not rated down since there was a uniform direction of effect with all studies favoring decision aids and a large proportion of the 
heterogeneity is explained by the variation in the control event rate. 

c. Imprecision was not rated down since the confidence intervals for the pooled RR are uniformly greater than 1.25 (with greater relative effects predicted for 
lower control event rates), and this estimate is based on over 2,000 patients in each arm. 

d. Investigation of reporting bias using funnel plots was not feasible for this outcome. Reporting bias was considered unlikely based on the thoroughness of the 
search and discussion provided in an earlier Cochrane update. 
confidence in true effect modification. Furthermore, when
combined in multiple regression with IPDASi probabilities
dimension scores, control event rate remained significant
despite the low power of this bivariate analysis. In univariate
analysis, approximately half the heterogeneity is accounted
for by the control event rate. Thus
control event rate, reflecting patients' baseline level of know- 
ledge of outcome probabilities, appears to be an important 
variable explaining heterogeneity of effects of decision aids 
on accuracy of knowledge of outcome probabilities, with 
greater relative effects expected when the baseline propor- 
tion of patients with accurate knowledge is lower. 

The precise relationship between control event rate and 
effect size is not intuitive from the meta-regression in
Figure 3, since both variables are on the logarithmic scale. 
To facilitate interpretation, the relationship was back- 
translated to show how the effect sizes are expected to 
vary over a range of control event rates, using relative risk, 
the effect measure commonly reported in the literature. 
The relationship thus represented in Figure 4 could have 
various predictive uses, such as for planning future trials 
evaluating decision aids. For example, when a control 
event rate of 0.5 is anticipated based on pilot work, the 
corresponding expected relative risk (of 1.4) could inform 
decisions about proceeding with the full trial. 
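
The back-translation described above can be sketched as follows. The code assumes the fitted line has the form ln(OR) = intercept + slope × logit(CER), takes the bias-corrected coefficients from Table 2, and converts the predicted odds ratio to a relative risk with the standard identity RR = OR / (1 − CER + CER × OR). The function names are illustrative, and the authors' exact procedure for producing Figure 4 may differ in detail.

```python
import math

INTERCEPT = 0.88  # Table 2, bias-corrected intercept
SLOPE = -0.466    # Table 2, bias-corrected slope

def logit(p):
    """Log-odds of a proportion strictly between 0 and 1."""
    return math.log(p / (1 - p))

def predicted_rr(cer, intercept=INTERCEPT, slope=SLOPE):
    """Relative risk predicted for a given control event rate (CER)."""
    odds_ratio = math.exp(intercept + slope * logit(cer))
    # Convert OR to RR at this baseline risk.
    return odds_ratio / (1 - cer + cer * odds_ratio)

# Range of control event rates observed across the included studies.
for cer in (0.08, 0.25, 0.50, 0.77):
    print(f"CER = {cer:.2f}: predicted RR = {predicted_rr(cer):.2f}")
```

At a control event rate of 0.5 this gives a predicted relative risk of about 1.4, in line with the value quoted in the text, and the predicted relative risk rises steadily as the control event rate falls.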

Given the clinical utility of being able to define what 
types of patients benefit most from an intervention 
using the relationship between effect size and the con- 
trol event rate (or level of baseline risk), one may ask 
how often is such a relationship significant for interven- 
tions in other contexts, and why is it not characterized 
more frequently? An analysis by Schmid and colleagues
provides an informative answer [5]. They examined 115
meta-analyses of clinical trials to detect whether there was 
an effect of control event rate on effect size. After 
correcting for dependence of the effect measure on con- 
trol event rate and using a two-standard error rule of sig- 
nificance, they found linear correlations with ln(OR) in 
only 14% of meta-analyses. They proposed that such effect 
modification is more likely to be found when the meta- 
analysis includes a sufficient number of studies (ten or 
more), and comprises greater variation in control event 
rates across included studies. The current meta-analysis, 
which includes 17 studies and has widely ranging control 
event rates (0.08 to 0.77), is consistent with this observation.
This follows from the idea that 'heterogeneity is your
friend', since more heterogeneity provides a better opportunity
to detect a covariate effect [38,39].

Finally, because the current analysis provides an explanation
for heterogeneity, the quality of the pooled research was
assessed with the GRADE framework [19,20] as 'moderate'
instead of 'low'. This reflects that the current investigation of
sources of heterogeneity improves the quality of the evidence
from a body of 17 pooled studies by improving its
interpretability and utility [2-4].

A limitation of investigating study-level sources of het- 
erogeneity is that interpretation may be affected by 
confounding from other study-level factors, particularly 
those related to study design [2]. These factors include 
both methodological aspects known to increase the risk of 
bias in an RCT (sequence generation, allocation conceal- 
ment, blinding of patients, blinding of intervention pro- 
viders, blinding of outcome assessors, and completeness 
of outcome data) and aspects of outcome measurement 




Figure 4 Empirically fitted relationship predicting relative risk when the control event rate (baseline knowledge) is known. CER, control 
event rate; RR, relative risk. 
[2]. Confounding by aspects of outcome measurement requires
considering characteristics of the questions used to
measure knowledge of probabilities. Evaluation questions
can vary, for example, in the number of categories to select
from within a multiple-choice question, whether
the question forces guessing (by not providing an option
for 'unsure'), whether numbers or words are used to represent
probabilities, and in the precision used to define
the probability ranges for each category. Some but not all
of these characteristics are also design features of decision 
aids whose influence on improving knowledge of probabil- 
ities has been established - for example, that presenting 
probabilities as numbers is more effective than words 
[7,40,41]. Similarly, specific characteristics would be 
expected to influence the difficulty of evaluation questions. 
Although there is extensive research and standards that 
guide and support the presentation of probabilities in de- 
cision aids [14,15], research into how relevant characteris- 
tics affect the difficulty of evaluation questions - and 
therefore influence the measurement of patient knowledge 
of probabilities - is lacking. It was not possible to conduct 
an analysis of these effects in the current study since de- 
scriptions of evaluation questions were not available for 
most studies. Considering how question difficulty has the 
potential to influence and confound estimates of baseline 
knowledge (control event rates), future research into this 
measurement issue is warranted. 

Conclusions 

The current sub-analysis increases the interpretability and 
utility of previously pooled evidence on the effects of deci- 
sion aids for improving accuracy of knowledge of outcome 
probabilities by adding data for this outcome and charac- 
terizing the effects of two potential contributors to hetero- 
geneity of decision aid effects: decision aid quality and the 
control event rate. While decision aid quality, as measured 
by the IPDASi probabilities dimension, may increase the 
effects of decision aids, this finding is of borderline signifi- 
cance and requires further analysis with data from add- 
itional studies. The control event rate - representing 
patients' baseline level of knowledge of outcome probabil- 
ities - is a highly significant and substantial contributor to 
heterogeneity, with greater relative effects observed when 
the baseline proportion of patients with accurate know- 
ledge is low. This suggests that decision aids are most 
effective in populations with low awareness. Further re- 
search may be warranted, however, to determine whether 
aspects of evaluation questions influence the measure- 
ment of knowledge of probabilities. Knowledge of how 
relative risk is expected to vary across a wide range of con- 
trol event rates may be useful to inform policy judgments 
about the uptake of decision aids to inform patients of 
probabilities related to the outcomes of interventions or 
diagnostic tests in specific settings. 